Knowledge-Based System for Structured Document Recognition
Abstract
This paper describes a document analysis system broadly consisting of a knowledge base, a blackboard, and a set of tasks, each with its own set of specialists for segmentation, recognition, and inheritance. The knowledge base contains a generic hierarchical description of the document structure in terms of logically labeled layout objects. This allows the generation of hypothetical networks of linked objects in the blackboard. The specialists cooperate indirectly through the blackboard by updating the layout object descriptors. A blackboard modification causes an "event" to propagate up to some specific tasks. A task could then choose another subset of specialists to carry on with the process. Finally, a synthesized blackboard summary allows a task selector to focus efficiently on the most useful layout object to process.

GRAPHEIN is a general-purpose system that could deal effectively with a variety of document classes. It is able to organize and control the diverse document recognition processes in a flexible and efficient manner. Section 2 presents the classes of document structure adopted and the knowledge sources taken into account in the GRAPHEIN project. The system architecture and the control structure are detailed in Sections 3 and 4, respectively. Finally, we conclude with a discussion of the appropriateness of such an architecture and propose further improvements.
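The blackboard mechanism described above (specialists that cooperate only by updating layout object descriptors, events that propagate to tasks, and a selector driven by a blackboard summary) can be illustrated with a minimal sketch. This is not GRAPHEIN's implementation: every class, function, and field name below is hypothetical, the two specialists are placeholders, and the "least complete descriptor first" selection rule is an assumption chosen only to make the example concrete.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class LayoutObject:
    """A blackboard entry: a layout object with a logical label and a descriptor."""
    name: str
    label: str  # logical label, e.g. "title" (hypothetical values)
    descriptor: Dict[str, object] = field(default_factory=dict)


class Blackboard:
    """Shared store; each descriptor update is reported to listeners as an event."""

    def __init__(self) -> None:
        self.objects: Dict[str, LayoutObject] = {}
        self.listeners: List[Callable[[str, str], None]] = []

    def add(self, obj: LayoutObject) -> None:
        self.objects[obj.name] = obj

    def update(self, name: str, key: str, value: object) -> None:
        self.objects[name].descriptor[key] = value
        for notify in self.listeners:
            notify(name, key)

    def summary(self) -> Dict[str, int]:
        """Synthesized view: number of descriptor fields filled in per object."""
        return {n: len(o.descriptor) for n, o in self.objects.items()}


# A specialist only reads and updates the blackboard; specialists never call each other.
Specialist = Callable[[Blackboard, str], None]


def segmenter(bb: Blackboard, name: str) -> None:
    bb.update(name, "bbox", (0, 0, 100, 20))  # placeholder geometry


def recognizer(bb: Blackboard, name: str) -> None:
    # Indirect cooperation: acts only if the segmenter's result is on the blackboard.
    if "bbox" in bb.objects[name].descriptor:
        bb.update(name, "text", "<recognized text>")


class Task:
    """Owns a subset of specialists and records the events it receives."""

    def __init__(self, bb: Blackboard, specialists: List[Specialist]) -> None:
        self.bb = bb
        self.specialists = specialists
        self.events: List[Tuple[str, str]] = []
        bb.listeners.append(lambda name, key: self.events.append((name, key)))

    def run(self, name: str) -> None:
        for specialist in self.specialists:
            specialist(self.bb, name)


def select_next(bb: Blackboard) -> str:
    """Task selector (assumed rule): pick the object with the least complete descriptor."""
    summary = bb.summary()
    return min(summary, key=summary.get)


def demo() -> Tuple[Blackboard, Task]:
    bb = Blackboard()
    bb.add(LayoutObject("block-1", "title"))
    bb.add(LayoutObject("block-2", "paragraph"))
    task = Task(bb, [segmenter, recognizer])
    task.run(select_next(bb))  # processes one object, chosen from the summary
    return bb, task
```

In this sketch the recognizer never invokes the segmenter; it simply checks whether the descriptor already holds a bounding box, which is the kind of indirect cooperation the abstract attributes to the specialists.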
Similar resources
Neural Network Based Recognition System Integrating Feature Extraction and Classification for English Handwritten
Handwriting recognition has been one of the active and challenging research areas in the field of image processing and pattern recognition. It has numerous applications, including reading aids for the blind, bank cheques, and conversion of any handwritten document into structural text form. Neural Network (NN) with its inherent learning ability offers promising solutions for handwritten characte...
Using a Neighbourhood Graph Based on Voronoï Tessellation with DMOS, a Generic Method for Structured Document Recognition
To develop a method for structured document recognition, it is necessary to know the relative position of the graphical elements in a document. In order to deal with this notion, we build a neighbourhood graph based on Voronoï tessellation. We propose to combine the use of this interesting notion of neighbourhood with an existing generic document recognition method, DMOS, which has been used to...
Query Architecture Expansion in Web Using Fuzzy Multi Domain Ontology
Due to the growth of the web, there are many challenges in establishing a general framework for data mining and retrieving structured data from the Web. Creating an ontology is a step towards solving this problem. The ontology raises the main entity and the concept of any data in data mining. In this paper, we tried to propose a method for applying the "meaning" of the search system, but the problem ...
Urban Vegetation Recognition Based on the Decision Level Fusion of Hyperspectral and Lidar Data
Introduction: Information about vegetation cover and its health has always been of interest to ecologists because of its importance in terms of habitat, energy production, and other important characteristics of plants on Earth. Nowadays, developments in remote sensing technologies have made more remotely sensed data accessible to researchers. The combination of these data improves the obje...
Gamera: A Python-based Toolkit for Structured Document Recognition
This paper presents Gamera, a new toolkit for the creation of domain-specific structured document recognition applications by domain experts with limited programming experience. The goal of the Gamera system is to leverage the user’s knowledge of the target documents to create custom applications rather than attempting to meet the needs of diverse users with a monolithic application. The system...